Abstract

In this paper we study a variety of novel online algorithm problems inspired by the game Mousehunt.
We consider a number of basic models that approximate the game, and we provide solutions to
these models using Markov Decision Processes, deterministic online algorithms, and randomized online 
algorithms. We analyze these solutions’ performance by deriving results on their competitive ratios. 


1 Introduction 

Mousehunt is a Facebook game developed in 2006 by HitGrab Inc. The goal of the game is to catch mice
using a variety of traps. Each species of mouse is worth a certain amount of points and gold. Although
collecting gold helps the player afford better traps, collecting points is the ultimate goal of the game. One
particularly noteworthy aspect of the game is that Ronza, a nomadic merchant, visits for short periods of
time roughly once a year, and sells valuable exclusive traps during these unannounced visits.

We introduce simple models of this game that involve optimizing the number of points gained over a 
finite time interval. While the problem’s overall description will resemble the classic ski rental problem, the 
finer details will differ, and we will be able to show different lower bounds on the competitive ratio. We will 
approach these problems using both deterministic and randomized online strategies to try to achieve the 
best possible competitive ratios. 

In this paper, we will use the convention that the competitive ratio $r$ is always at most 1, i.e. if our
algorithm earns value $C_a$, and the optimal offline algorithm earns value $C_{opt}$, then $C_a \ge r \cdot C_{opt}$.

We begin by proposing a simple model for Mousehunt, where we start with a basic trap that can selectively
catch mice worth one point or one gold. Assuming that we don’t know when Ronza will arrive next, and
that we have some estimate $x$ of the benefit we gain from Ronza’s traps, we are able to prove that it is
optimal to hunt for gold if and only if the ratio of the gold cost of the trap $c$ to the timespan $T$ satisfies
$c/T < 1 - 1/\sqrt{x}$. That is, the potential amount of benefit we can gain is worth it iff the cost of the trap is not
too high. In this case, we obtain a competitive ratio of $1/\sqrt{x}$.

If we randomize our strategy, it turns out we can do better than 1/2-competitive on average. 
2 A Simple Model

2.1 Modeling Mousehunt 

In the game Mousehunt, players attempt to catch different types of mice which give players rewards of points 
and gold when caught. The reward can therefore be represented as a vector (p,g) where p is the amount of 
points and g is the amount of gold. In the real game, players can use different trap setups to increase their 
catch rates against certain sets of mice, and the player can select which mice to target by changing their 
trap setup (by arming certain types of cheese or traveling to certain locations). Traps can be purchased for 
certain amounts of gold from trapsmiths or from Ronza, who visits the mouse-hunting land of Gnawnia once 
a year. 

2.2 Initial Problem 

Suppose that there are three traps available, $A_1$, $A_2$, and $B$. Using trap $A_1$ will allow the player to catch a
mouse worth 1 gold, and using trap $A_2$ will allow the player to catch a mouse worth 1 point. Using trap $B$
will allow the player to catch a mouse worth $x$ points for $x > 1$. Traps $A_1$ and $A_2$ are available at the start
of the game while trap $B$ can only be purchased from the wandering merchant Ronza. There are $T$ time
steps in total. Ronza appears for a single time step at some unknown time $y$, and sells trap $B$ at cost $c$.
Before each time step, and also at the end, the player makes a decision to lay down either trap $A_1$, trap $A_2$,
or $B$, and then immediately reaps the rewards of their choosing. This setup poses two different but related
problems: one, to determine an optimal trap-choosing strategy to maximize the expected payoff given some
distribution assumptions, and two, to find a strategy that maximizes the competitive ratio.
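As a rough illustration of this model, the payoffs of the two natural strategies can be sketched in Python. This is a minimal sketch, not part of the original analysis: the function names are hypothetical, and additive constants (such as the extra decision at the end) are ignored, as in the asymptotic arguments later in the paper.

```python
def payoff_points_only(T):
    """Points earned by hunting 1-point mice for all T time steps."""
    return T


def payoff_gold_first(x, c, T, y):
    """Points earned by collecting gold first, then switching to points.

    Assumes the player gathers gold until c coins are saved or until
    Ronza arrives at time y, whichever comes first (hypothetical helper).
    """
    if y >= c:
        # c steps of gold, y - c steps of 1-point mice, then trap B
        # for the remaining T - y steps at x points each.
        return (y - c) + x * (T - y)
    # Ronza came too early: y steps were spent on gold, the rest on points.
    return T - y
```

For example, with $x = 2$, $c = 3$, $T = 10$, an arrival at $y = 5$ makes the gold strategy worth 12 points against 10 for points only, while an arrival at $y = 2$ leaves it with only 8.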

3 Markov Decision Process Analysis 

3.1 Distribution and Optimality 

For the rest of this section, we will assume that Ronza’s arrival time is distributed uniformly among 1,2,..., T. 
We can perform similar analyses with different distributions, but it will be most clear to analyze the uniform 
distribution case. 

We establish the following lemma concerning optimal online algorithms which solve this problem. 
Lemma 3.1. The online algorithm, if it is optimal, can be assumed to take on one of the following formats:

• Only hunt for the mouse worth 1 point. 

• Hunt for the mouse worth 1 coin until c coins are gathered or until the merchant arrives. Purchase 
the trap B if possible when the merchant arrives, and only hunt for mice worth a maximum amount of 
points afterward. 

Proof. First, suppose that the online algorithm $O_1$ at time $t$ decides to earn points and at the next time
step $t + 1$ decides to collect gold. Then, consider the alternative online algorithm $O_2$ that collects gold at $t$
and earns points at $t + 1$. Both of these choices are conditioned on Ronza having not arrived yet, as it is
clear that the only right decision after Ronza arrives is to earn points. If Ronza appears after time $t + 1$ or
before time $t$, both algorithms perform equally well. If Ronza appears between times $t$ and $t + 1$, then the
algorithm $O_1$ will have performed worse. Thus, any deterministic optimal online algorithm will collect gold
at the beginning, if at all, and hence take on one of the two prescribed formats. □

3.2 Markov Decision Process 

One approach for solving the initial problem is to view it as a Markov Decision Process. For a fixed $x$, define
$f(c, T)$ to be the optimal decision when there are $T$ time steps remaining and we require $c$ more coins for
Ronza’s trap; the lemma tells us that $f(c, T)$ can only take on two forms: aim for points, or aim for $c$ coins.
Define $g(c, T)$ to be the expected payoff in points when pursuing this optimal decision. The base cases go as
follows. When $c = 0$, we have $f(c, T)$ as aim for points and $g(c, T) = \frac{x+1}{2}(T + 1)$, because Ronza comes
in the middle of the time steps on average. When $c = 1$, we have $g(c, T)$ as $\max(\frac{x+1}{2}T, T + 1)$: the latter
payoff is achieved by aiming for points, and the former payoff is achieved by aiming for a single coin. When
$T = 1$, $c > 1$, we have $f(c, T)$ as aim for points and $g(c, T) = 2$. Then:

Theorem 3.2. In the state $(c, T)$ where $T > 1$ and $c > 0$, $f(c, T)$ and $g(c, T)$ can be determined by comparing
$T + 1$ and $\frac{T-1}{T} g(c - 1, T - 1) + 1$. If the latter is larger, then it is the expected payoff, and the optimal move
$f(c, T)$ is to aim for $c$ gold. Otherwise, if the former is larger, then it is the expected payoff, and the optimal
move $f(c, T)$ is to aim for points.

Proof. By the lemma, we may reduce the optimal decision to two possibilities: aim for points, or aim for
$c$ coins. If we decide to aim for points, the payoff will be $1 + g(c, T - 1)$. However, it is not rational to
aim for coins in future time steps if we do not aim for coins now. Hence, the payoff must evaluate to
$1 + g(c, T - 1) = 1 + T$. Otherwise, if we decide to aim for $c$ coins, then our payoff will depend on whether
or not Ronza arrives in the next time step.

If Ronza arrives in the next time step, then we have no choice but to gain 1 point after all remaining
time steps, accruing a total of $T$ points. This occurs with probability $\frac{1}{T}$.

If Ronza arrives after the next time step, then her arrival time will be uniformly distributed among the
remaining $T - 1$ time steps, just as in the state $(c - 1, T - 1)$. This reduction allows us to conclude that the
payoff is $g(c - 1, T - 1)$, and occurs with probability $\frac{T-1}{T}$.

Thus, the expected payoff when aiming for $c$ coins is $\frac{T-1}{T} g(c - 1, T - 1) + \frac{1}{T} \cdot T = \frac{T-1}{T} g(c - 1, T - 1) + 1$.
Between this payoff and the payoff of $T + 1$ from greedily amassing points, whichever one is larger will dictate
both $f(c, T)$ and $g(c, T)$. □

Theorem 3.3. Let $r = \frac{c}{T}$. Then asymptotically, $f(c, T)$ will dictate that it is optimal to catch mice to aim
for $c$ gold iff $r < 1 - \frac{1}{\sqrt{x}}$.

Proof. We will no longer keep track of unneeded additive constants as we are determining asymptotic behavior.
Suppose that Ronza’s arrival time is $y$. We already know from the lemma that the alternative to
catching mice to aim for $c$ gold is catching mice to aim for points. The latter yields a payoff of $T$ (in fact,
$T + 1$). Now we will compute the payoff of the former in a non-recursive fashion.
• Case 1: Ronza comes after time $c$.

This means that $c < y$, and decisions made according to this algorithm will earn $y - c + x(T - y)$
points: $y - c$ points from mice worth 1 point each, and $x(T - y)$ points from mice worth $x$ points each.
The average value of $y$ here is $\frac{c+T}{2}$.

• Case 2: Ronza comes before time $c$. This means $c > y$, and the player can never afford trap $B$, so the
algorithm will simply catch mice worth 1 point for all remaining time steps, yielding $T - y$ points. The
average value of $y$ here is $\frac{c}{2}$.

This means that if we decide to aim for $c$ coins, our expected gain in points is

$$\frac{T-c}{T}\left(\frac{c+T}{2} - c + x\left(T - \frac{c+T}{2}\right)\right) + \frac{c}{T}\left(T - \frac{c}{2}\right).$$

Equalizing the payoffs of the two decisions allows us to determine the asymptotic boundary:

$$\begin{aligned}
T &= \frac{T-c}{T}\left(\frac{T-c}{2} + \frac{x(T-c)}{2}\right) + \frac{c}{T}\left(T - \frac{c}{2}\right) \\
1 &= (1 - r) \cdot (1 - r)\left(\frac{x+1}{2}\right) + r \cdot \left(1 - \frac{r}{2}\right) \\
0 &= \frac{x}{2} r^2 - xr + \frac{x-1}{2} \\
0 &= xr^2 - 2xr + x - 1
\end{aligned}$$

Solving this quadratic and taking the root such that $r < 1$ yields $r = 1 - \frac{1}{\sqrt{x}}$. Therefore, the asymptotic
behavior of this Markov Decision Process can be summarized as: until Ronza makes an appearance, aim for
$c$ coins if the state $(c, T)$ satisfies $\frac{c}{T} < 1 - \frac{1}{\sqrt{x}}$, and aim for points otherwise. □
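The asymptotic boundary can be checked numerically by averaging the gold-first payoff over a uniform arrival time and comparing it with the points-only payoff $T$. The following is a minimal sketch under the same simplifications as the proof (additive constants dropped); the function names are hypothetical.

```python
import math


def expected_gold_payoff(x, c, T):
    """Average payoff of the gold-first strategy over y = 1, ..., T."""
    total = 0.0
    for y in range(1, T + 1):
        if y >= c:
            # Trap B is bought at time y.
            total += (y - c) + x * (T - y)
        else:
            # Ronza arrived before c coins were collected.
            total += T - y
    return total / T


def gold_threshold(x):
    """Asymptotic boundary r = 1 - 1/sqrt(x) from the quadratic in the proof."""
    return 1.0 - 1.0 / math.sqrt(x)
```

With $x = 4$ the threshold is $r = 0.5$, so for $T = 1000$ one would expect aiming for gold to beat $T$ at $c = 400$ and to fall short of $T$ at $c = 600$.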

Below, we present a few values of $g(c, T)$ when $x = 2$, that is to say, when trap $B$ is twice as effective
as trap $A_2$. Bolded entries represent states in which $f(c, T)$ dictates that aiming for $c$ coins is strictly better
than aiming for points. Notice that the boundary between the two strategies being optimal closely follows
the line $c = (1 - \frac{1}{\sqrt{x}})T$, as shown in the theorem above.

[Table: values of $g(c, T)$ for $x = 2$, indexed by $(c, T)$.]
It is not difficult to generalize this table for all $x$. Note that for a given $(c, T)$, we can use dynamic
programming to fill in this table and compute $f(c, T)$ and $g(c, T)$ in $O(cT)$ time. However, for large values
of $c$, $T$, we can bypass Markov Decision Processes altogether and use Theorem 3.3 to compute them in $O(1)$
time.

Our analyses so far relied on the distribution of Ronza’s arrival time being uniform. If the distribution is
not known, however, then Markov Decision Processes cannot be used to model the decision problem. Next,
we will show a solution to the online algorithm problem against any adversary.
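The dynamic program can be sketched as follows. This is an illustrative sketch rather than a definitive implementation: it uses the recurrence of Theorem 3.2 with the base case $c = 0$, and it assumes (as a simplification) that every state with $T = 1$ and $c \ge 1$ aims for points with payoff 2. The name `solve_mdp` and the table layout are hypothetical.

```python
def solve_mdp(x, c_max, T_max):
    """Fill g[c][T] and f[c][T] for 0 <= c <= c_max and 1 <= T <= T_max."""
    g = [[0.0] * (T_max + 1) for _ in range(c_max + 1)]
    f = [[""] * (T_max + 1) for _ in range(c_max + 1)]
    for T in range(1, T_max + 1):
        # Base case c = 0: no more coins are needed for the trap.
        g[0][T] = (x + 1) / 2 * (T + 1)
        f[0][T] = "points"
    for c in range(1, c_max + 1):
        # Base case T = 1 (assumed here for every c >= 1): aim for points.
        g[c][1] = 2.0
        f[c][1] = "points"
        for T in range(2, T_max + 1):
            points = float(T + 1)
            # Expected payoff of aiming for c coins (Theorem 3.2).
            gold = (T - 1) * g[c - 1][T - 1] / T + 1
            if gold > points:
                g[c][T], f[c][T] = gold, "gold"
            else:
                g[c][T], f[c][T] = points, "points"
    return f, g
```

Each entry depends only on the entry at $(c - 1, T - 1)$, so the table is filled in $O(cT)$ time as stated above.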
4 Competitive Analysis Model

In this section we analyze online algorithms to solve the initial problem without a known probability distribution.
We examine both an optimal deterministic online algorithm, and a randomized online algorithm
that achieves the best possible asymptotic competitive ratio.


4.1 Competitive Ratio Analysis: Deterministic 


If the online algorithm decided only to earn points, the worst case scenario is that the optimal offline algorithm
was to earn $c$ gold and buy trap $B$ when Ronza arrived to earn $x$ points per subsequent mouse. This means
that $c < y$, and the offline algorithm would garner $\max(y - c + x(T - y), T)$ points. If $\max(y - c + x(T - y), T) = T$,
the competitive ratio is 1. We now analyze the case when $\max(y - c + x(T - y), T) = (y - c) + x(T - y)$,
where the competitive ratio of the two algorithms is $\frac{T}{(y - c) + x(T - y)}$.

We can find an upper bound on the reciprocal of this fraction to find a lower bound on this ratio. Let
$r = \frac{c}{T}$. Then the reciprocal of the ratio can be upper bounded as follows.

$$\frac{(y - c) + x(T - y)}{T} = x - \frac{c}{T} + \frac{y}{T}(1 - x) < x - \frac{c}{T} + \frac{c}{T}(1 - x) = x\left(1 - \frac{c}{T}\right) = x(1 - r)$$

where we used the fact that $(1 - x)$ is negative and $c < y$. The competitive ratio is thus lower bounded by
$\frac{1}{x(1 - r)}$.

If the online algorithm decided to collect gold to try to buy the trap, the worst case scenario is that
Ronza appears before the algorithm has collected enough gold to purchase the trap. In this case, $c > y$, and
the online algorithm would earn $T - y$ points (it would earn gold until the merchant arrived at time $y$, after
which it would earn points) and the offline optimal algorithm would earn $T$ points. The competitive ratio
here is $\frac{T - y}{T}$. Here we find that the ratio is

$$1 - \frac{y}{T} > 1 - \frac{c}{T} = 1 - r$$

Thus, given knowledge of $x$, $c$, $T$, we can choose our strategy based on the larger of the values $\frac{1}{x(1 - r)}$ and
$1 - r$ to achieve the best competitive ratio. Since these two expressions vary in opposite directions as $r$
increases, the worst case occurs when the two expressions are equal, i.e.

$$\begin{aligned}
\frac{1}{x(1 - r)} &= 1 - r \\
1 - r &= \frac{1}{\sqrt{x}}
\end{aligned}$$

Thus, since our analysis was worst case, we obtain a tight lower bound of $\frac{1}{\sqrt{x}}$ on our competitive ratio. Thus
this deterministic algorithm is always $\frac{1}{\sqrt{x}}$-competitive for all values of $r$.
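The deterministic rule above can be sketched as a small Python function. This is a hedged illustration: the function name is hypothetical, and the points-only guarantee is capped at 1 whenever $x(1 - r) \le 1$, matching the case in which the offline optimum earns only $T$.

```python
def deterministic_choice(x, c, T):
    """Return the strategy with the larger worst-case ratio, and that ratio."""
    r = c / T
    # Guarantee when only earning points; the ratio cannot exceed 1.
    points_bound = 1.0 if x * (1 - r) <= 1 else 1.0 / (x * (1 - r))
    # Guarantee when collecting gold first.
    gold_bound = 1.0 - r
    if points_bound >= gold_bound:
        return "points", points_bound
    return "gold", gold_bound
```

For $x = 4$ the two guarantees cross at $r = 0.5$, where both equal $1/\sqrt{x} = 0.5$; every other value of $r$ gives a guarantee at least that large.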


4.2 Competitive Ratio Analysis: Randomized 

Here we describe and analyze a randomized online algorithm that achieves a worst case competitive ratio of 
1/2 for all values of x, c, and T. 
First, if $x(T - c) < T$, then it is always optimal just to go for points because going for the trap would
result in $y - c + x(T - y) \le x(T - c) < T$ points at the end. This information is available to the online
algorithm so the online algorithm is exactly the same as the offline one in this case.

Thus, we just have to analyze the case that x(T — c) > T. As before, any optimal online algorithm will 
either only aim for points or it will collect gold until Ronza arrives or until it has collected c gold, after 
which it will collect points and buy Ronza’s trap if possible. 

The randomized algorithm is to choose to try to gather gold for Ronza’s trap with probability $q$ and
to choose to aim for points only with probability $1 - q$. We will determine $q$ later based on $x$ and $r = \frac{c}{T}$.
Call the gold gathering strategy $S_g$ and the point gathering strategy $S_p$. Let $S$ denote the randomly chosen
strategy.

Given our choice of q, the adversary’s goal is to choose Ronza’s arrival time y so that the competitive 
ratio is minimized. Our goal is to choose a q to maximize this minimal value. 

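If $y < c$ then the competitive ratio is

$$\begin{cases} \frac{T - y}{T} & \text{if } S = S_g \\ 1 & \text{if } S = S_p \end{cases}$$

If $y \ge c$ then the competitive ratio is

$$\begin{cases} 1 & \text{if } S = S_g \\ \frac{T}{(y - c) + x(T - y)} & \text{if } S = S_p \end{cases}$$

Thus the overall competitive ratio, denoted by $R(x, c, T)$, is

$$R(x, c, T) = \begin{cases} q \cdot \left(\frac{T - y}{T}\right) + (1 - q) & : y < c \\ q + (1 - q)\frac{T}{(y - c) + x(T - y)} & : y \ge c \end{cases}$$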


Suppose $q$ has been chosen already. If the adversary chooses $y < c$, then $R(x, c, T)$ is minimized when $y$
is maximized, or when $y = c - 1 \approx c$. If $y \ge c$, then minimizing $R(x, c, T)$ involves maximizing

$$(y - c) + x(T - y) = xT - c + y(1 - x)$$

which occurs when $y$ is minimized since $x > 1$. Thus in this case the adversary will choose $y = c$.
Thus we get that

$$R(x, c, T) = \begin{cases} q \cdot \left(\frac{T - c}{T}\right) + (1 - q) = 1 - \frac{c}{T} q & : y < c \\ q + (1 - q)\frac{T}{x(T - c)} = \frac{T}{x(T - c)} + \left(1 - \frac{T}{x(T - c)}\right) q & : y \ge c \end{cases}$$

Note that the coefficient of q is negative in the case y < c and the coefficient of q is positive in the case 
y > c since x(T — c) > T. As q increases, R(x, c, T ) goes down in the case y < c and goes up in the case 
y>c. 
The two possible expressions for $R(x, c, T)$ thus change in opposite directions as $q$ is varied. Thus, if we
want to maximize the minimum of these two numbers, we have to set them equal. This gives

$$\begin{aligned}
1 - \frac{c}{T} q &= \frac{T}{x(T - c)} + \left(1 - \frac{T}{x(T - c)}\right) q \\
q\left(1 - \frac{T}{x(T - c)} + \frac{c}{T}\right) &= 1 - \frac{T}{x(T - c)} \\
q\left(\frac{x(T - c)T - T^2 + cx(T - c)}{x(T - c)T}\right) &= \frac{-T + x(T - c)}{x(T - c)} \\
q &= \frac{-T^2 + xT(T - c)}{x(T - c)T - T^2 + cx(T - c)} \\
q &= \frac{xT^2 - T^2 - cxT}{xT^2 - T^2 - c^2 x}
\end{aligned}$$

This gives the optimal value of $q$ given $x$, $c$, and $T$. If we let $r = \frac{c}{T}$ we can rewrite this as

$$q = \frac{x - 1 - rx}{x - 1 - r^2 x}$$


We can also rewrite the condition $x(T - c) > T$ as

$$\begin{aligned}
& x(1 - r) > 1 \\
\implies{} & x - xr > 1 \\
\implies{} & \frac{x - 1}{x} > r
\end{aligned}$$

Plugging this value of $q$ back into $R(x, c, T)$, we get that the competitive ratio of this randomized
algorithm is

$$R(x, c, T) = 1 - \frac{c}{T} q = 1 - rq = 1 - \frac{xr - r - r^2 x}{x - 1 - r^2 x} = \frac{(x - 1)(1 - r)}{x - 1 - r^2 x}$$
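The choice of $q$ and the resulting ratio can be sketched in Python. This is a minimal sketch that assumes $x(1 - r) > 1$, the only case in which randomization is needed; the function names are hypothetical.

```python
def optimal_q(x, r):
    """Probability of choosing the gold strategy, assuming x(1 - r) > 1."""
    return (x - 1 - r * x) / (x - 1 - r * r * x)


def randomized_ratio(x, r):
    """Worst-case ratio R = 1 - r*q of the randomized algorithm."""
    return 1 - r * optimal_q(x, r)


def ratio_if_ronza_on_time(x, r):
    """Ratio in the branch y >= c, with the adversary choosing y = c."""
    q = optimal_q(x, r)
    return q + (1 - q) / (x * (1 - r))
```

Because $q$ was chosen by equating the two branches, `randomized_ratio` and `ratio_if_ronza_on_time` should agree up to rounding. For instance, $x = 4$ and $r = 0.5$ give $q = 0.5$ and a ratio of $0.75$ in both branches.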

To find a lower bound on the competitive ratio of $R$, we need to minimize $R(x, c, T)$ over all values of $x$,
$c$, and $T$. Since $R$ only depends on the ratio $r = \frac{c}{T}$ asymptotically, we can just consider $R(x, r)$.

To minimize $R$, we compute the partial derivative

$$\begin{aligned}
\frac{\partial R}{\partial r} &= \frac{-(x - 1)(x - 1 - r^2 x) - (-2rx)(x - 1)(1 - r)}{(x - 1 - r^2 x)^2} \\
&= \frac{(x - 1)\left(-(x - 1 - r^2 x) + 2rx - 2r^2 x\right)}{(x - 1 - r^2 x)^2} \\
&= \frac{(x - 1)(-r^2 x + 2rx - x + 1)}{(x - 1 - r^2 x)^2}
\end{aligned}$$

$R$ is minimized when the partial derivative is 0, or at the endpoints. At the endpoint $r = 0$ we get
$R(x, r) = 1$, and at the endpoint $r = \frac{x - 1}{x}$, we know $x - 1 = rx$, so

$$R(x, r) = \frac{rx(1 - r)}{rx - r^2 x} = 1$$

When the partial derivative is 0, we have that

$$(x - 1)\left(1 - x(1 - r)^2\right) = 0$$
Since $x > 1$, solving this gives

$$r = 1 - \frac{1}{\sqrt{x}}$$

This value of $r$ is valid because we know that $r = 1 - \frac{1}{\sqrt{x}} < 1 - \frac{1}{x} = \frac{x - 1}{x}$. Plugging this value of $r$ back into
$R(x, r)$ we get that

$$R(x, r) = \frac{(x - 1)\left(1 - \left(1 - \frac{1}{\sqrt{x}}\right)\right)}{x - 1 - \left(1 - \frac{1}{\sqrt{x}}\right)^2 x} = \frac{1}{2}\left(1 + \frac{1}{\sqrt{x}}\right)$$

We can find the minimum value of this over all $x$. If we let $x$ range from 1 to $\infty$ we see that $R(x, r) > \frac{1}{2}$
for all $x$.
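The location of the minimum can be checked numerically with a simple grid search over $r \in (0, \frac{x-1}{x})$. This is only a sanity-check sketch; the function names and the grid resolution are arbitrary choices rather than part of the analysis.

```python
import math


def randomized_ratio(x, r):
    """R(x, r) = (x - 1)(1 - r) / (x - 1 - r^2 x)."""
    return (x - 1) * (1 - r) / (x - 1 - r * r * x)


def min_ratio_over_r(x, steps=100000):
    """Smallest R(x, r) on an evenly spaced grid strictly inside (0, (x-1)/x)."""
    r_max = (x - 1) / x
    return min(randomized_ratio(x, r_max * i / steps) for i in range(1, steps))


def predicted_minimum(x):
    """Closed form (1/2)(1 + 1/sqrt(x)), attained at r = 1 - 1/sqrt(x)."""
    return 0.5 * (1 + 1 / math.sqrt(x))
```

The grid minimum should agree with the closed form up to the grid resolution, and it stays above $\frac{1}{2}$ for every finite $x$.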

One way to intuitively derive the same competitive ratio is that the optimal offline algorithm clearly
either goes for gold for Ronza’s trap or goes for points the entire time. Thus, a basic randomized algorithm
would just flip a fair coin. If the coin landed heads, it would choose to go for gold, and if it landed tails,
it would choose to go for points. $\frac{1}{2}$ of the time the randomized algorithm will match the optimal offline
algorithm, so we are at least $\frac{1}{2}$-competitive.

The result we showed was a proof that the algorithm was $\frac{1}{2}$-competitive for all values of $x$, $c$, and $T$.
However, in reality, for most values of $x$, $c$, and $T$, the competitive ratio our randomized algorithm obtains
is $R(x, r)$, which, for most values of $r$, is much better than the ratio obtained by the single randomized coin flip
algorithm. As an example, consider the graphs shown in Figure 1 of $R(x, r)$ for the cases $x = 4$ and $x = 100$
for the appropriate range of $r \in (0, \frac{x - 1}{x})$.




Figure 1: Left plot: x = 4. Right plot: x = 100. 


4.3 Proof of Optimality of Randomized Algorithm 

We will conclude this section with a proof that for any randomized online algorithm, an asymptotic competitive
ratio of $\frac{1}{2}$ over all values of $x$, $c$, and $T$ is optimal, using Yao’s minimax principle.

Yao’s minimax principle states that the expected cost of a randomized algorithm on the worst case input 
is no better than the worst case probability distribution of inputs for a deterministic algorithm that works 
best for that worst case distribution. Thus, we simply have to present a probability distribution such that 
no deterministic algorithm can perform very well for it. 

It is not difficult to find such a distribution. Given the parameters x , c, and T, we can define two possible 
inputs (which are just values of y , Ronza’s arrival time) and give them as inputs to the deterministic algorithm 
with probability $\frac{1}{2}$ each. The first input is $y = c$ and the second input is $y = c - 1$. Then, the deterministic
algorithm can be analyzed as follows.

• Case 1: The deterministic algorithm gathers coins for the first $c$ steps (assuming Ronza doesn’t appear).
Then, for the input $y = c$, the deterministic algorithm will be 1-competitive. For the input $y = c - 1$,
at time $t = c - 1$ the algorithm will realize that it cannot buy the trap, and so it will then gather
points afterwards. This gives us an overall competitive ratio of

$$\frac{1}{2}\left(1 + \frac{T - c + 1}{T}\right) \approx \frac{1}{2}\left(1 + (1 - r)\right)$$

• Case 2: The deterministic algorithm does anything else. Then, this means at time $t = c$, the algorithm
will have less than $c$ gold. Then, for the input $y = c$, the algorithm will fail to buy the trap. For
$y = c - 1$, the algorithm will also fail to buy the trap, and at best it will be 1-competitive. The
algorithm will have earned at most $c$ points in the first $c$ time steps. The optimal offline algorithm will
buy the trap for $c$ gold in the case $y = c$. Thus the overall competitive ratio is

$$\frac{1}{2}\left(\frac{c + T - c}{(y - c) + x(T - y)} + 1\right) = \frac{1}{2}\left(\frac{T}{x(T - c)} + 1\right) = \frac{1}{2}\left(1 + \frac{1}{x(1 - r)}\right)$$

In the special case that $r = 1 - \frac{1}{\sqrt{x}}$, we know that the competitive ratio of this deterministic algorithm
will be equal to $\frac{1}{2}\left(1 + \frac{1}{\sqrt{x}}\right)$. Thus, as $x$ approaches infinity, any optimal deterministic algorithm for this
input will be at best $\frac{1}{2}$-competitive, so $\frac{1}{2}$-competitive is as good as any randomized algorithm can be for the
original problem.
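The two cases of this argument can be evaluated numerically. The sketch below assumes $x(1 - r) \ge 1$ and uses the asymptotic forms $\frac{1}{2}(1 + (1 - r))$ for the gold-gathering algorithm and $\frac{1}{2}(1 + \frac{1}{x(1 - r)})$ for any other algorithm; the function name is hypothetical.

```python
import math


def yao_bound(x, r):
    """Best ratio any deterministic algorithm gets on the two-point input
    distribution (y = c or y = c - 1, each with probability 1/2)."""
    gather_gold = 0.5 * (1 + (1 - r))               # Case 1
    anything_else = 0.5 * (1 + 1 / (x * (1 - r)))   # Case 2
    return max(gather_gold, anything_else)
```

At $r = 1 - 1/\sqrt{x}$ both cases coincide, so `yao_bound(x, 1 - 1 / math.sqrt(x))` equals $\frac{1}{2}(1 + 1/\sqrt{x})$, which tends to $\frac{1}{2}$ as $x$ grows.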


5 Unknown Cost, Fixed Arrival Time 

In this section, we explore a variant of our initial problem. Now, suppose that Ronza’s arrival time y is 
known, but the cost c of buying trap B is unknown. 

5.1 Online and Offline Algorithms 

First, we consider the offline algorithm. If c is known, then this reduces to our initial offline problem. 
Therefore, if $c \le y$, the optimal algorithm will get $\max(y - c + x(T - y), T)$ points, and if $c > y$, then the
algorithm gets $T$ points.

Any deterministic online algorithm can be characterized by a number $m \le y$, where it collects gold for $m$
of the first $y$ time steps, and hunts for points otherwise. It will buy Ronza’s trap at time $y$ if it is affordable
at time $y$. Now, there are two possible outcomes:

• If the algorithm manages to collect $c$ gold, i.e. $m \ge c$, then it is always optimal to buy trap $B$ if
possible. Then the algorithm earns a total of $y - m + x(T - y)$ points.

• If $m < c$, then the algorithm earns $T - m$ points.
In order to find a competitive ratio for this online problem, we consider the problem from the perspective
of the adversary. For given choices of $c$ and $m$, we have the following cases and known ratios for the online
and offline algorithms:

1. If $c > y$, we have a ratio of $\frac{T - m}{T}$

2. If $m < c \le y$, and $y - c + x(T - y) \le T$, we have a ratio of $\frac{T - m}{T}$

3. If $m < c \le y$, and $y - c + x(T - y) > T$, we have a ratio of $\frac{T - m}{y - c + x(T - y)}$

4. If $m \ge c$, and $y - c + x(T - y) \le T$, we have a ratio of $\frac{y - m + x(T - y)}{T}$

5. If $m \ge c$, and $y - c + x(T - y) > T$, we have a ratio of $\frac{y - m + x(T - y)}{y - c + x(T - y)}$

As the adversary, our job is to design $c$ for fixed $m$ in order to produce the smallest ratio. Note that since
cases 1 and 2 produce the same ratio, and there always exists a $c$ to satisfy case 1, we need not consider case
2. Similarly, because

$$y - m + x(T - y) \ge T - m \iff (x - 1)(T - y) \ge 0$$
is true for any $m$, case 4 is redundant given case 1.

Now notice that as the ratio in case 5 is an increasing function of $c$, its minimum is achieved at $c = 0$.
Thus, if we compare the ratios in cases 1 and 5, we have

$$\frac{T - m}{T} \le \frac{y - m + x(T - y)}{y + x(T - y)} \iff Ty + xT^2 - xTy - my - xTm + mxy \le Ty - Tm + xT^2 - xyT$$

Canceling terms and rearranging, we obtain

$$0 \le m(x - 1)(T - y)$$

which is always true. Thus, case 5 is redundant given case 1.

It remains to compare the ratios of cases 1 and 3, and for an online algorithm’s choice of $m$, it is the
adversary’s goal to choose $c$ to obtain the smaller ratio. Note that the value of $c$ that minimizes ratio 3 is
$c = m + 1$, assuming such a $c$ satisfies the constraints, so we are left with two ratios to compare: $\frac{T - m}{T}$ from
case 1, and $\frac{T - m}{y - m - 1 + x(T - y)}$ from case 3.

In words, case 1 corresponds to when both offline and online algorithm cannot buy Ronza’s trap, and
case 3 corresponds to when the offline can and does buy the trap while the online cannot, with conditions
$c < (x - 1)(T - y)$ and $m < c \le y$.

5.2 Worst Case Analysis 

Suppose the online algorithm chooses a value of $m = \min(\lfloor (x - 1)(T - y) \rfloor, y)$. In this case, no integral
value of $c$ can satisfy the constraints $c < (x - 1)(T - y)$ and $m < c \le y$ in case 3, so the resulting ratio is
$\frac{T - (x - 1)(T - y)}{T}$ or $\frac{T - y}{T}$ depending on whether or not $(x - 1)(T - y) < y$.

In the other case, suppose the online algorithm chooses $m < \min(\lfloor (x - 1)(T - y) \rfloor, y)$. The worst
competitive ratios that the adversary can return are $\frac{T - m}{T}$ and $\frac{T - m}{y - m - 1 + x(T - y)}$. The latter expression can be
written as $1 - \frac{y - 1 + x(T - y) - T}{y - m - 1 + x(T - y)}$. Because these are both decreasing functions of $m$, the online algorithm would
choose $m = 0$ to obtain a worst case ratio of $\frac{T}{y - 1 + x(T - y)}$.

The online algorithm, given x , y, T, can choose m to obtain the better competitive ratio. We consider the 
cases to obtain a tight lower bound on the optimal competitive ratio. In the following analysis, let r = y/T 
and a = 1 — r. 
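The adversary's choice of $c$ and the online choice of $m$ can be checked by brute force on small instances. The sketch below evaluates the online and offline payoffs directly from the definitions in Section 5.1 rather than from the five-case table; the function names are hypothetical, and $c = y + 1$ is used to stand in for every cost $c > y$.

```python
def worst_ratio(m, x, y, T):
    """Smallest online/offline ratio over the adversary's choices of c."""
    worst = float("inf")
    for c in range(0, y + 2):  # c = y + 1 represents every c > y
        # Offline: buy the trap only if affordable and worthwhile.
        offline = max(y - c + x * (T - y), T) if c <= y else T
        # Online: collects m gold, buys the trap iff m >= c.
        online = y - m + x * (T - y) if m >= c else T - m
        worst = min(worst, online / offline)
    return worst


def best_m(x, y, T):
    """Online choice of m in 0..y with the largest worst-case ratio."""
    return max(range(0, y + 1), key=lambda m: worst_ratio(m, x, y, T))
```

For $x = 4$, $y = 50$, $T = 100$ the search selects $m = y = 50$ with ratio $1 - r = 0.5$, while $m = 0$ only guarantees $\frac{T}{y - 1 + x(T - y)} = \frac{100}{249}$, in line with the two candidate ratios above.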

Case 1: $(x - 1)(T - y) \ge y$, i.e. $x \ge \frac{y}{T - y} + 1 = \frac{1}{a}$, or $a \ge 1/x$. Here, we also know that $\lfloor (x - 1)(T - y) \rfloor > m \ge 0$,
where $m$ is an integer, so $(x - 1)(T - y) \ge 1$, i.e. $x \ge \frac{1}{T - y} + 1 \ge \frac{1}{T} + 1$. Note that the ratio our
online algorithm can obtain is $\max\left(1 - r, \frac{1}{r - 1/T + x(1 - r)}\right)$, where the first is a decreasing function of $r$ and the
second is increasing. Thus, a lower bound on the max of the two ratios is obtained when we set them equal
and solve for the optimal $a' = 1 - r'$, i.e.

$$\begin{aligned}
\frac{1}{r' - 1/T + x(1 - r')} &= 1 - r' \\
1 &= xa'^2 + a'(1 - a' - 1/T) \\
0 &= (x - 1)a'^2 + (1 - 1/T)a' - 1 \\
a' &= \frac{-1 + 1/T + \sqrt{(1 - 1/T)^2 + 4(x - 1)}}{2(x - 1)}
\end{aligned}$$

We confirm that this value of $a'$ satisfies the condition $a' \in (0, 1]$, as $a'$ is a decreasing function of $x$, and at
the endpoint $x = 1 + \frac{1}{T}$, we can compute that

$$a' = \frac{-1 + 1/T + \sqrt{(1 - 1/T)^2 + 4(1/T)}}{2(1/T)} = \frac{-1 + 1/T + 1 + 1/T}{2(1/T)} = 1$$

The value also satisfies $a' \ge 1/x$ because

$$\begin{aligned}
& \frac{-1 + 1/T + \sqrt{(1 - 1/T)^2 + 4(x - 1)}}{2(x - 1)} \ge 1/x \\
\iff{} & (1 - 1/T)^2 + 4(x - 1) \ge (1 - 1/T)^2 + 4(1 - 1/x)(1 - 1/T) + 4(1 - 1/x)^2 \\
\iff{} & 1 \ge \frac{1}{x}(1 - 1/T) + \frac{x - 1}{x^2} \\
\iff{} & x + 1/x - 2 \ge -1/T
\end{aligned}$$

which is always true.

Then the lower bound is $1 - r = a'$ as given above, which asymptotically in $x$ is $O(1/\sqrt{x})$.
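The closed form for $a'$ is easy to check numerically. The following is a minimal sketch (the function name `a_prime` is hypothetical) that evaluates the positive root of $(x - 1)a'^2 + (1 - 1/T)a' - 1 = 0$.

```python
import math


def a_prime(x, T):
    """Positive root of (x - 1) a^2 + (1 - 1/T) a - 1 = 0, for x > 1."""
    b = 1 - 1 / T
    return (-b + math.sqrt(b * b + 4 * (x - 1))) / (2 * (x - 1))
```

For example, with $x = 4$ and $T = 100$ the root is roughly $0.4355$, which lies above $1/x = 0.25$ and equalizes the two ratios $1 - r'$ and $\frac{1}{r' - 1/T + x(1 - r')}$.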

Case 2: $(x - 1)(T - y) < y$, i.e. $x < \frac{1}{a}$, or $a < 1/x$. Our online algorithm now obtains a ratio
$\max\left(1 - (x - 1)(1 - r), \frac{1}{r - 1/T + x(1 - r)}\right)$. Both ratios are now increasing in $r$, so to lower bound the maximum,
we set $r$ to its minimum, or $a$ to its maximum $1/x$.

We obtain the ratio

$$\max\left(1 - (x - 1)\frac{1}{x}, \frac{1}{2 - 1/x - 1/T}\right) = \max\left(1/x, \frac{x}{2x - 1 - x/T}\right)$$

Then since we are able to achieve the second ratio, we obtain asymptotically in $x$ a competitive ratio of
$\frac{1}{2 - 1/T}$.

Surprisingly, we are able to obtain an equal asymptotic competitive ratio of $O(1/\sqrt{x})$ as in the arrival
time unknown problem, given certain inputs, and a better competitive ratio of $1/(2 - 1/T)$ for any other
input.

6 Generalizations 

In reality, the game of Mousehunt is a lot more complex. In order to better approximate the game, we can 
improve our model of the game and attempt to analyze those models. Thus, we can generalize our model 
further and ask the follow-up questions: 

• What if there are traps $L_1, L_2, \ldots, L_m$ that can catch different mice and have different costs that are
always available for purchase at a local trapsmith?

• What if we don’t know $x$, the effectiveness of Ronza’s trap, ahead of time?

• What if the mice give reward vectors $(p_i, g_i)$, where $p_i$ is the number of points gained and $g_i$ is the
amount of gold collected from catching mouse $i$?

• What if Ronza appears with multiple traps available? What if Ronza appears multiple times? 

• What if there is a probability distribution for the unknown variables in the problem? What can be 
said for some common distributions other than the uniform distribution? What if there are some other 
restrictions on the unknown variables, like upper and lower bounds? 

• What if multiple parameters are unknown at the same time? (For example, if both the cost and the 
arrival time of Ronza’s trap are unknown). 

These problems are much harder to analyze due to the increase in the number of parameters. For example, 
in the case of Ronza appearing multiple times, it becomes important to consider strategies that may buy 
certain Ronza traps early on in order to buy other Ronza traps in the future. Because these problems are 
better approximations of the actual game, they are definitely worth exploring in the future. 

7 Conclusion 

The above analysis shows many interesting results that spring out of a basic model of the game Mousehunt. 
Assuming the basic model of unknown arrival time, if we fix the probability distribution of the arrival times, 
we can model the problem as a Markov Decision Process and solve for the optimal deterministic algorithm.
If we don’t fix the probability distribution, an optimal deterministic online algorithm can still always
be at least $1/\sqrt{x}$-competitive, and with randomization, can achieve better than 1/2-competitive.
Using Yao’s minimax principle we show that we can do no better asymptotically than 1/2-competitive over
all possible values of the parameters, so the asymptotic bound of 1/2 achieved by the randomized algorithm
is tight.
Under the model of unknown cost, an optimal algorithm can actually achieve roughly a ratio of $O(1/\sqrt{x})$
in most cases and an even better ratio of $\frac{1}{2 - 1/T}$ in special cases. While these values are asymptotically
similar to the values obtained for the model of unknown arrival time, the exact expressions for the competitive
ratios and the analyses of these online algorithms differ greatly.

This Mousehunt problem appears similar to the well known ski rental problem, which has an optimal 
randomized online algorithm that achieves (e — l)/e-competitiveness. However, the algorithms and analysis 
results for these two problems are quite different, demonstrating the fundamental difference between the 
problems. 

Finally, the Mousehunt problem has many more parameters, and so its optimal competitive ratio will 
depend on more parameters. For most parameter settings the competitive ratio is much better than the 
worst case ratio of 1/2, but for the worst possible parameter settings, the ratio is still 1/2. 

The Mousehunt problem can also be applied to similar situations in other fields. For example, consider 
the problem of maximizing your net worth in life. At every time step, you have to decide between earning 
money now (e.g. working at a grocery store) and preparing yourself for future work by studying hard in 
school, establishing connections, and learning about entrepreneurship. In the case that a golden opportunity 
comes knocking at your door (for example, if a venture capitalist offers to hear your startup sales pitch), 
you could potentially get a huge boost in your pay rate if you impress him enough to have him invest in 
you. However, if you are unprepared and have not worked enough, then you can’t take advantage of the 
opportunity when it comes. Our algorithm details when exactly it is better to earn money now or study 
now to prepare for the future, assuming that you have some estimate on the gain in wealth upon founding 
a startup. 

While these results probably will not affect how people play the game of Mousehunt or how people try 
to maximize their net worth due to the level of complexity of these real situations compared to our model, 
they still show interesting results and methods of analysis of online algorithms, and how randomization 
can be used to improve an algorithm’s competitiveness. Furthermore, this paper demonstrates how even a 
simple online problem can produce complex and unexpected analysis results, such as the asymptotic $1/\sqrt{x}$
competitive ratio. 

8 Acknowledgements 

We would like to thank our instructor Prof. Karger for guidance on this project, as well as HitGrab Inc. for 
developing Mousehunt. 


J. Ling: jefiiing@mit.edu
K. Xiao: kaix@mit.edu
D. Yang: zephyred@mit.edu





